Authors
Ira Assent, Marc Wichterich, Thomas Seidl
Abstract
Today’s abundance of storage, coupled with digital technologies in virtually all scientific and commercial applications such as medical and biological imaging or music archives, has produced tremendous quantities of images, videos, and audio files stored in large multimedia databases. For content-based data mining and multimedia retrieval, suitable similarity models are crucial. Adaptable distance functions are particularly well suited to matching the human perception of similarity. Quadratic Forms (QF) were introduced to capture inter-feature similarity, which sets them apart from more traditional feature-by-feature measures such as the Euclidean or Manhattan distance. The Earth Mover’s Distance (EMD) was adopted in computer vision to better approximate human perceptual similarity by allowing feature transformation under a number of restrictions. After recapping the concepts of distance-based similarity search in databases, we familiarize the reader with the flexible building blocks behind Quadratic Forms and the EMD, which enable their application to a large variety of multimedia retrieval problems. Unfortunately, this flexibility comes at a cost: both distances are relatively time-consuming to compute, which severely limits their adoption in interactive multimedia database scenarios. We therefore investigate methods to speed up the retrieval process and present encouraging recent results obtained with an index-supported multistep algorithm based on new lower-bounding approximation techniques.
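To make the two central ideas of the abstract concrete, the following minimal Python sketch contrasts a Quadratic Form distance with the Euclidean distance and then runs a filter-and-refine range query. It is an illustration under stated assumptions, not the authors' method: the three-bin histograms, the similarity matrix `A`, and the function names are hypothetical, the filter uses the simple bound sqrt(lambda_min) * Euclidean distance (valid for a symmetric positive definite `A`) rather than the lower-bounding approximations the abstract refers to, and a linear scan stands in for the index.

```python
import math

def euclidean_distance(x, y):
    """Feature-by-feature distance: bins are compared only with themselves."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def quadratic_form_distance(x, y, A):
    """sqrt((x - y)^T A (x - y)); A[i][j] encodes similarity of features i and j."""
    d = [xi - yi for xi, yi in zip(x, y)]
    total = 0.0
    for i, di in enumerate(d):
        for j, dj in enumerate(d):
            total += di * A[i][j] * dj
    return math.sqrt(max(total, 0.0))  # guard against tiny negative rounding

def range_query(query, database, A, lambda_min, eps):
    """Filter-and-refine range query (linear scan, no index).

    Filter: sqrt(lambda_min) * Euclidean distance never exceeds the
    Quadratic Form distance when lambda_min is the smallest eigenvalue of
    a symmetric positive definite A, so pruning on it loses no results.
    Refine: compute the exact distance only for surviving candidates.
    """
    results, refinements = [], 0
    scale = math.sqrt(lambda_min)
    for obj in database:
        if scale * euclidean_distance(query, obj) > eps:
            continue  # safely pruned by the lower bound
        refinements += 1
        if quadratic_form_distance(query, obj, A) <= eps:
            results.append(obj)
    return results, refinements

# Hypothetical histograms over the bins (red, orange, blue).
red, orange, blue = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
mostly_red = [0.9, 0.1, 0.0]

# Assumed similarity matrix: red and orange are half-similar, blue is unrelated.
# Its eigenvalues are 0.5, 1.0 and 1.5, so lambda_min = 0.5.
A = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

database = [red, orange, blue, mostly_red]
results, refinements = range_query(red, database, A, lambda_min=0.5, eps=0.5)
```

With this matrix, the Euclidean distance rates orange and blue as equally far from red, while the Quadratic Form distance places orange closer, which is the inter-feature effect the abstract describes. In the range query, the filter discards orange and blue without evaluating the exact distance, so only two of the four objects are refined.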
Similar resources
Anticipatory DTW for Efficient Similarity Search in Time Series Databases
Time series arise in many different applications in the form of sensor data, stocks data, videos, and other time-related information. Analysis of this data typically requires searching for similar time series in a database. Dynamic Time Warping (DTW) is a widely used high-quality distance measure for time series. As DTW is computationally expensive, efficient algorithms for fast computation are...
Metrische Anpassung der Earth Mover's Distanz zur Ähnlichkeitssuche in Multimedia-Datenbanken [Metric Adaptation of the Earth Mover's Distance for Similarity Search in Multimedia Databases]
The Earth Mover’s Distance (EMD), first introduced in the field of computer vision, has found its way into content-based similarity search in multimedia databases across a wide range of applications owing to its high flexibility. Techniques for efficiently processing EMD queries have been presented in the recent past and are mostly based on multistep processing...
ClasSi: Measuring Ranking Quality in the Presence of Object Classes with Similarity Information
The quality of rankings can be evaluated by computing their correlation to an optimal ranking. State of the art ranking correlation coefficients like Kendall’s τ and Spearman’s ρ do not allow for the user to specify similarities between differing object classes and thus treat the transposition of objects from similar classes the same way as that of objects from dissimilar classes. We propose Cl...
Less is More: Non-Redundant Subspace Clustering
Clustering is an important data mining task for grouping similar objects. In high-dimensional data, however, effects attributed to the “curse of dimensionality” render clustering meaningless. Due to this, recent years have seen research on subspace clustering, which searches for clusters in relevant subspace projections of high-dimensional data. As the number of possibl...
Pleiades: Subspace Clustering and Evaluation
Subspace clustering mines the clusters present in locally relevant subsets of the attributes. In the literature, several approaches have been suggested along with different measures for quality assessment. Pleiades provides the means for easy comparison and evaluation of different subspace clustering approaches, along with several quality measures specific for subspace clustering as well as ext...